BUPTTeam Participation in NTCIR-12 Short Text Conversation Task

Authors

  • Yongmei Tan
  • Minda Wang
  • Songbo Han
Abstract

This paper provides an overview of BUPTTeam’s system, which participated in the Chinese subtask of the Short Text Conversation (STC) task at NTCIR-12. STC is a new, challenging NTCIR task that is defined as an IR problem, i.e., retrieval based on a repository of post-comment pairs from Sina Weibo. In this paper, we propose a novel method to retrieve results for a post from the repository based on the following four steps: 1) preprocessing, 2) building a search index, 3) comment candidate generation, and 4) comment candidate ranking. The evaluation results show that our method significantly outperforms the state of the art on the STC Chinese task.
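
The four steps named in the abstract can be illustrated with a minimal sketch. This is not BUPTTeam's system: the abstract does not specify the tokenizer, the index, or the ranking model, so everything below is an assumption made for illustration. Whitespace splitting stands in for a Chinese word segmenter, a plain inverted index stands in for the search index, and Jaccard overlap between the query and each stored post stands in for the ranking step. All function names and the toy English repository are hypothetical.

```python
from collections import defaultdict


def preprocess(text):
    """Step 1: normalize and tokenize.

    Assumption: whitespace splitting stands in for a real Chinese segmenter.
    """
    return text.lower().split()


def build_index(pairs):
    """Step 2: inverted index from each post token to the ids of matching pairs."""
    index = defaultdict(set)
    for pair_id, (post, _comment) in enumerate(pairs):
        for token in preprocess(post):
            index[token].add(pair_id)
    return index


def generate_candidates(query, index):
    """Step 3: every pair whose post shares at least one token with the query."""
    candidates = set()
    for token in preprocess(query):
        candidates |= index.get(token, set())
    return candidates


def rank_candidates(query, candidates, pairs):
    """Step 4: order candidate comments by Jaccard overlap of query and post tokens."""
    query_tokens = set(preprocess(query))

    def score(pair_id):
        post_tokens = set(preprocess(pairs[pair_id][0]))
        return len(query_tokens & post_tokens) / len(query_tokens | post_tokens)

    # Ties are broken by pair id so the ordering is deterministic.
    ordered = sorted(candidates, key=lambda pair_id: (-score(pair_id), pair_id))
    return [pairs[pair_id][1] for pair_id in ordered]


def retrieve(query, pairs):
    """Run the four steps end to end for a single query post."""
    index = build_index(pairs)
    return rank_candidates(query, generate_candidates(query, index), pairs)


# Toy post-comment repository (hypothetical data, English for readability).
pairs = [
    ("the weather is nice today", "let us go for a walk"),
    ("i love rainy weather", "bring an umbrella"),
    ("new phone released today", "how much does it cost"),
]

print(retrieve("nice weather today", pairs))
```

In practice the index would be built once and reused across queries; it is rebuilt inside `retrieve` here only to keep the sketch short.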

Similar resources

BUPTTeam at the NTCIR-13 STC-2 Task

This paper provides an overview of BUPTTeam’s system, which participated in the Chinese Short Text Conversation (STC) task at NTCIR-13. STC is a new, challenging NTCIR task that is defined as an information retrieval (IR) or natural language generation problem. In this paper, we propose a novel method to generate appropriate comments based on the following four steps: 1) preprocessing, 2) model bui...

Overview of the NTCIR-12 Short Text Conversation Task

We describe an overview of the NTCIR-12 Short Text Conversation (STC) task, which is a new pilot task of NTCIR-12. STC consists of two subtasks: a Chinese subtask using post-comment pairs crawled from Weibo, and a Japanese subtask providing the IDs of such pairs from Twitter. Thus, the main difference between the two subtasks lies in the sources and languages of the test collections. For the Ch...

Nders at the NTCIR-12 STC Task: Ranking Response Messages with Mixed Similarity for Short Text Conversation

Short Text Conversation (STC) is a typical scenario in man-machine conversation, which simplifies the conversation into a one-round interaction and makes the related tasks more practical. This paper presents a simple approach to the Chinese STC task issued by NTCIR-12. Given a repository of post-comment pairs, for any query, we define three types of similarity and merge them according to empirica...

Analysis of Similarity Measures between Short Text for the NTCIR-12 Short Text Conversation Task

With the rise of social networking services, short texts such as micro-blogs have become a valuable resource for practical applications. When using text data in applications, estimating the similarity between texts is an important process. Conventional methods have assumed that an input text is sufficiently long such that we can rely on statistical approaches, e.g., counting word occurrences. However...

SLSTC at the NTCIR-12 STC Task

The SLSTC team participated in the NTCIR-12 Short Text Conversation (STC)[1] task. This report describes our approach to solving the STC problem and discusses the official results.

Publication date: 2016